Abstract

We present a novel method for communicating between a camera and display by
embedding and recovering hidden and dynamic information within a displayed
image. A handheld camera pointed at the display can receive not only the
display image, but also the underlying message. These active scenes are
fundamentally different from traditional passive scenes like QR codes because
image formation is based on display emittance, not surface reflectance.
Detecting and decoding the message requires careful photometric modeling for
computational message recovery. Unlike standard watermarking and steganography
methods that lie outside the domain of computer vision, our message recovery
algorithm uses illumination to optically communicate hidden messages in real
world scenes. The key innovation of our approach is an algorithm that performs
simultaneous radiometric calibration and message recovery in one convex
optimization problem. By modeling the photometry of the system using a
camera-display transfer function (CDTF), we derive a physics-based kernel
function for support vector machine classification. We demonstrate that our
method of optimal online radiometric calibration (OORC) leads to an efficient
and robust algorithm for computational messaging between nine commercial
cameras and displays.
Figure 1. The above flowchart illustrates the process by which Online
Radiometric Calibration is used to estimate and negate the light-altering
effects of the Camera-Display Transfer Function (CDTF) in camera-display
communication. Variables such as camera pose, photometry, and hardware all have
a significant effect on light signals passing from electronic display to
camera. In each pair of intensity histograms shown above, the left represents
an image’s histogram before passing through the CDTF, and the right represents
the histogram after the CDTF. Online Radiometric Calibration mitigates the
distorting effects of the CDTF to preserve the image’s histogram, enabling more
accurate image recovery.


1. Introduction 

While traditional computer vision concentrates on objects that reflect
environment lighting (passive scenes), objects which emit light, such as
electronic displays, are increasingly common in modern scenes. Unlike passive
scenes, active scenes can have intentional information that must be detected
and recovered. For example, displays with QR codes [13] can be found in
numerous locations such as shop windows and billboards. However, QR codes are
very simple examples because the bold, static pattern makes detection
relatively easy. The problem is more challenging from a computer vision point
of view when the codes are not visible markers, but rather are hidden within a
displayed image. The displayed image is a light field, and decoding the message
is an interesting problem in photometric modeling and computational
photography. The paradigm has numerous applications because the electronic
display and the camera can act as a communication channel where the display
pixels are transmitters and the camera pixels are receivers [2][1][31]. Unlike
hidden messaging in the digital domain, prior work in real-world camera-display
messaging is very limited. In this paper, we develop an optimal method for
sending and retrieving hidden time-varying messages using electronic displays
and cameras which accounts for the characteristics of light emittance from the
display. We assume the electronic display has two simultaneous purposes: 1) the
original display function such as advertising, maps, slides, or artwork; 2) the
transmission of hidden time-varying messages.

Figure 2. Image Formation Pipeline: The image $I_d$ is displayed by an
electronic display with an emittance function $e$. The display is observed by a
camera with sensitivity $s$ and radiometric response function $f$.

When light is emitted from a display, the resultant 3D light field has an
intensity that depends on the angle of observation as well as the pixel value
controlled by the display. The emittance function of the electronic display is
analogous to the BRDF (bidirectional reflectance distribution function) of a
surface. This function characterizes the light radiating from a display pixel.
It has a particular spectral shape that does not match the spectral sensitivity
curve of the camera. The effects of the display emittance function, the
spectral sensitivity of the camera and the effect of camera viewing angle are
all components of our photometric model for image formation as shown in
Figure 2. Our approach does not require measurement or knowledge of the exact
display emittance function. Instead, we measure the entire system transfer
function, as a camera-display transfer function (CDTF), which determines the
captured pixel value as a function of the displayed pixel value. By using
frame-to-frame characterization of the CDTF, the method is independent of the
particular choice of display and camera.

Interestingly, while our overall goal has very strong similarities to the field
of watermarking and steganography, we present results that are novel and are
aligned with the goals of computational photography. Although watermarking
literature has many hidden messaging methods, this area largely ignores the
physics of illumination. Display-camera messaging is fundamentally different
from watermarking because each pixel of the image is a light source that
propagates in free space. Therefore, representations and methods that act only
in the digital domain are not sufficient.

The problem of understanding the relationship between the displayed pixel and
the captured pixel is closely related to the area of radiometric calibration
[22] [6] [24]. In these methods, a brightness transfer function characterizes
the relationship between scene radiance and image pixel values. This function
is characterized by measuring a range of scene radiances and the corresponding
captured image pixels. Our problem in camera-display messaging is similar but
has key differences. The CDTF is more complex than standard radiometric
calibration because the system consists of both a display and a camera, each
device adding its own nonlinearities. We can exploit the control of pixel
intensities on the display and easily capture the full range of input
intensities. However, the display emittance function is typically dependent on
the display viewing angle. Therefore, the CDTF is dependent on camera pose. In
a moving camera system, the CDTF must be estimated per frame; that is, an
online CDTF estimation is needed. Furthermore, this function varies spatially
over the electronic display surface.

We show that the two-part problem of online radiometric calibration and
accurate message retrieval can be structured as an optimization problem. This
leads to the primary contribution of the paper. We present an elegant problem
formulation where the photometric modeling leads to physically-motivated kernel
functions that are used with a support vector machine classifier. We show that
calibration and message bit classification can be done simultaneously, and that
the resulting optimization algorithm operates in a four-dimensional space and
is convex. The algorithm is a novel method for online optimal radiometric
calibration (OORC) that enables accurate camera-display messaging. An example
message recovery result is shown in Figure 3. Our experimental results show
that accuracy levels for message recovery can improve from as low as 40-60% to
higher than 90% using our approach when compared to either no calibration or
sequential calibration followed by message recovery. For evaluation of results,
9 different combinations of displays and cameras are used with 15 different
image sequences, for multiple embedded intensity values, and multiple
camera-display view angles.

The standard problem of radiometric calibration is solved by varying exposure
so that a range of scene radiance can be measured. For CDTF estimation,
textured patches are placed within the display image that have intensity
variation over the full range of display brightness values. These patches can
be placed in inconspicuous regions of the display image or in corners. We use
the term ratex patch to refer to these radiometric calibration texture patches.
The ratex patches are not used as part of the hidden message. Multiple ratex
patches can be used to find a spatially varying CDTF. The ratex patches have
the advantage that they are perceptually acceptable, they represent the entire
range of gray-scale intensity variation, and they can be distributed spatially.
Furthermore, these patches are used for support vector machine training as
described in Section 4.

Additionally, we introduce a method of radiometric calibration that employs
visually non-disruptive “hidden ratex” mapping. Rather than directly measuring
the effect that the CDTF has on known intensity values, we are able to model
the CDTF based on changes to a known frequency distribution of intensity
values. Radiometric calibration with hidden ratex produces a
distribution-driven intensity mapping that mitigates the photometric effects of
the CDTF for simple message recovery.

(a) Difference image (b) Thresholding (c) Our method

Figure 3. Comparison of message recovery with a naive method and the proposed
optimal method. (a) Difference of two consecutive frames in the captured
sequence to reveal the transmitted message. (b) Naive method: threshold the
difference image by a constant (threshold T = 5 for this example). (c) Optimal
method: bits are classified by a simultaneous radiometric calibration and
support vector machine classifier.

The contributions of the paper can be summarized as follows: 1) a new optimal
online radiometric calibration with simultaneous message recovery, cast as a
convex optimization problem; 2) a photometric model of the camera-display
transfer function; 3) the use of ratex patches to provide continual calibration
information as a practical method for online calibration; 4) the use of
distribution-driven intensity mapping as a practical method for visually
non-disruptive online calibration.

2. Related Work 

Watermarking. In developing a system where cameras and displays can communicate
under real world conditions, the initial expectation was that existing
watermarking techniques could be used directly. Certainly the work in this
field is extensive and has a long history with numerous surveys compiled [4]
[35] [28] [5] [14] [27]. Surprisingly, existing methods are not directly
applicable to our problem. In the field of watermarking, a fixed image or mark
is embedded in an image often with the goal of identifying fraudulent copies of
a video, image or document. Existing work emphasizes almost exclusively the
digital domain and does not account for the effect of illumination in the image
formation process in real world scenes. In the digital domain, neglecting the
physics of illumination is quite reasonable; however, for camera-display
messaging, illumination plays a central role.

From a computer vision point of view, the imaging process can be divided into
two main components: photometry and geometry. The geometric aspects of image
formation have been addressed to some extent in the watermarking community, and
many techniques have been developed for robustness to geometric changes during
the imaging process such as scaling, rotations, translations and general
homography transformations [7] [29] [8] [34] [19] [28] [30]. However, the
photometry of imaging has largely been ignored. The rare mention of photometric
effects [40] [37] in the watermarking literature doesn’t define photometry with
respect to illumination; instead photometric effects are defined as “lossy
compression, denoising, noise addition and lowpass filtering”. In fact,
photometric attacks are sometimes defined as JPEG compression [8].

Radiometric Calibration. Ideally, we consider the pixel values in a camera
image to be a measurement of light incident on the image plane sensor. It is
well known that the relationship is typically nonlinear. Radiometric
calibration methods have been developed to estimate the camera response
function that converts irradiance to pixel values. In measuring a camera
response, a series of known brightness values are measured along with the
corresponding pixel values. In general, having such ground truth brightness is
quite difficult. The classic method [6] uses multiple exposure values instead.
The light intensity on the sensor is a linear function of the time of exposure,
so known exposure times provide ground truth light intensity. This
exposure-based method is used in several radiometric calibration methods [22]
[24] [6] [21] [17]. Our goal for the display-camera system is related to
radiometric calibration, yet different in significant ways. We are interested
not just in a system that converts scene radiance to pixels (the camera), but
also one that converts from pixel to scene radiance (the display), so that the
whole camera-display system is a function that maps a color value at the
display to a color value at the camera.

The camera response in radiometric calibration is either estimated as a full
mapping where $i_{out}$ is specified for every $i_{in}$ or as an analytic
function $g(i_{in})$. Several authors [22] [3] [18] use polynomials to model
the radiometric response function. Similarly, we have found that fourth order
polynomials can be used for modeling the inverse display-camera transfer
function. The dependence on color is typically modeled by considering each
channel independently [22] [24] [6] [9]. Interestingly, although more complex
color models have been developed [16] [20] [36], we have found the independent
channel approach suitable for the display-camera representation where the
optimality criterion is accurate message recovery.

Existing radiometric calibration methods are developed for cameras, not
camera-display systems. Therefore, the display emittance function is not part
of the system to be calibrated. However, for the camera-display transfer
function, this component plays an important role. We do not use the measured
display emittance function explicitly, but since the CDTF is view dependent and
the camera can move, our approach is to perform radiometric calibration per
frame, by the insertion of radiometric calibration patches (ratex patches).


Other Methods for Camera-Display Communication

Camera-display communications have precedent in the computer vision community,
but existing methods differ from our proposed approach. For example,
researchers on the Bokode project [23] presented a system using an invisible
message; however, the message is a fixed symbol, not a time-varying message.
Invisible QR codes were addressed in [15], but these QR codes are fixed.
Similarly, traditional watermark approaches typically contained fixed messages.
LCD-camera communications is presented in [25] with a time-varying message, but
the camera is in a fixed position with respect to the display. Consequently,
the electronic display is not detected, tracked or segmented from the
background. Furthermore, the transmitted signal is not hidden in this work.
Recent work has been done in high speed visible light communications [32], but
this work does not utilize existing displays and cameras and requires
specialized hardware and LED devices. Time-of-flight cameras have recently been
used for phase-based communication [39], but these methods require special
hardware. Interest in camera-display messaging is also shared in the mobile
communications domain. COBRA, RDCode, and Strata have developed 2D barcode
schemes designed to address the challenges of low resolution and slow shutter
speeds typically present in smartphone cameras [10] [33] [12]. Likewise,
Lightsync has targeted synchronization challenges with low frequency cameras
[11].

3. System Properties

In our proposed camera-display communication system, pixel values from the
display are inputs, while captured intensities from the camera are output. We
denote the mapping from displayed intensities to captured ones as the
Camera-Display Transfer Function (CDTF). In this section, we motivate the need
for online radiometric calibration by briefly analyzing factors that commonly
influence the CDTF.

3.1. Display Emittance Variation

Displays vary widely in brightness, hue, white balance, contrast and many other
parameters that will influence the appearance of light. To test this
hypothesis, an SLR camera with fixed parameters observes 3 displays and models
the CDTF for each one. See Samsung in Fig. 4(a), LG in Fig. 4(b), and iMac in
Fig. 4(c). Although each display is tuned to the same parameters, including
contrast and RGB values, each display produces a unique CDTF.

(a) Samsung (b) LG (c) iMac

Figure 4. Variance of Light Output among Displays. An SLR camera captured a
range of grayscale [0,255] intensity values produced by 3 different LCDs. These
3 CDTF curves highlight the dramatic difference in the light emittance function
for different displays, particularly the LG.


3.2. Observation Angles 

Displays do not emit light in all directions with the same power level.
Therefore, the CDTF is also sensitive to observation angles. To verify this
hypothesis, an experiment was performed where an SLR camera captured the light
intensity produced by a computer display from multiple angles. The results in
Fig. 5 show that more oblique observation angles yield lower captured pixel
intensities. Moreover, there is a nonlinear relationship between captured light
intensity and viewing angle.

4. Methods 

4.1. Photometry of Display-Camera systems 

The captured image $I_c$ from the camera viewing the electronic display image
$I_d$ can be modeled using the image formation pipeline in Figure 2. First,
consider a particular pixel within the display image $I_d$ with red, green and
blue components given by $p = (p_r, p_g, p_b)$. The captured image $I_c$ at the
camera has three color components $(I_c^r, I_c^g, I_c^b)$; however, there is no
one-to-one correspondence between the color channels of the camera sensitivity
function and the electronic display emittance function. When the monitor
displays the value $(p_r, p_g, p_b)$ at a pixel, it is emitting light in a
manner governed by its emittance function and modulated by $p$. The monitor
emittance function $e$ is typically a function of the viewing angle
$\theta = (\theta_v, \phi_v)$ comprised of a polar and azimuthal component. For
example, the emittance function of an LCD monitor has a large decrease in
intensity with polar angle (see Figure 6).

(a) 30° (b) 45° (c) 60°

Figure 5. Influence of observation angles. Using the Nikon-Samsung pair, a
range of grayscale [0, 255] values were displayed and captured from a set of
different observation angles. As observation angle became more oblique, the
captured light intensity sharply decreased. Therefore, observation angle has a
dramatic, nonlinear effect on CDTF.

(a) 30° (b) 45° (c) 60°

Figure 6. Histograms of intensities across the display. Notice as observation
angle changes, so does the frequency distribution of captured intensities. If
the intensity distribution (histogram) of the displayed image is known, an
observer can estimate the CDTF.

The emittance function has three components, i.e. $e = (e_r, e_g, e_b)$.
Therefore the emitted light $I$ as a function of wavelength $\lambda$ for a
given pixel $(x, y)$ on the electronic display is given by

$$I(x, y, \lambda) = p_r e_r(\lambda, \theta) + p_g e_g(\lambda, \theta) + p_b e_b(\lambda, \theta), \tag{1}$$

or

$$I(x, y, \lambda) = p \cdot e(\lambda, \theta). \tag{2}$$

Now consider the intensity of the light received by one pixel element at the
camera sensor. Let $s_r(\lambda)$ denote the camera sensitivity function for
the red component; then the red pixel value $I_c^r$ can be expressed as

$$I_c^r \propto \int_{\lambda} \left[ p \cdot e(\lambda, \theta) \right] s_r(\lambda) \, d\lambda. \tag{3}$$

Notice that the sensitivity function of the camera has a dependence on
wavelength that is likely different than the emittance function of the monitor.
That is, the interpretation of “red” in the monitor is different from that of
the camera. Notice that a sign of proportionality is used in Equation 3 to
specify that the pixel value is a linear function of the intensity at the
sensor, assuming a linear camera and display. This assumption will be removed
in Section 4.3.

Equation 3 can be written to consider all color components in the captured
image $I_c$ as

$$I_c \propto \int_{\lambda} \left[ p \cdot e(\lambda, \theta) \right] s(\lambda) \, d\lambda, \tag{4}$$

where $s = (s_r, s_g, s_b)$.
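
As a rough numerical illustration of Equation 4, the integral over wavelength can be approximated by a sum over spectral samples at a fixed viewing angle. The sketch below is not part of the original method: the three-sample spectra are made-up values rather than measurements, and the name `captured_rgb` is hypothetical.

```python
def captured_rgb(p, emittance, sensitivity, d_lambda=1.0):
    """Discrete version of I_c ∝ ∫ [p · e(λ, θ)] s(λ) dλ for one viewing angle.

    p           : displayed pixel (p_r, p_g, p_b)
    emittance   : three lists e_r, e_g, e_b sampled at the same wavelengths
    sensitivity : three lists s_r, s_g, s_b sampled at the same wavelengths
    """
    n = len(emittance[0])
    # Emitted spectrum I(λ) = p_r e_r(λ) + p_g e_g(λ) + p_b e_b(λ)  (Equations 1-2)
    emitted = [sum(p[c] * emittance[c][k] for c in range(3)) for k in range(n)]
    # One weighted sum per camera channel (Equations 3-4)
    return tuple(
        sum(emitted[k] * s[k] for k in range(n)) * d_lambda
        for s in sensitivity
    )


# Illustrative (assumed) spectra sampled at three wavelengths: long, medium, short.
E = [[0.9, 0.1, 0.0],   # e_r
     [0.1, 0.8, 0.1],   # e_g
     [0.0, 0.2, 0.8]]   # e_b
S = [[0.7, 0.3, 0.0],   # s_r
     [0.2, 0.7, 0.1],   # s_g
     [0.0, 0.1, 0.9]]   # s_b

# A pure-red display pixel still produces a nonzero green response at the camera,
# because the display and camera spectra overlap without matching.
pure_red = captured_rgb((255, 0, 0), E, S)
```

With these assumed spectra, the green component of `pure_red` is nonzero, which is the channel mismatch described in the text: "red" at the monitor is not "red" at the camera.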

4.2. Message Structure 

The pixel value $p$ is controllable by the electronic display driver, and so it
provides a mechanism for embedding information. We use two sequential frames in
our approach. We modify the monitor intensity by adding the value $\kappa$ and
transmit two consecutive images, one with the added value, $I_e$, and one image
of original intensity, $I_o$. The recovered message depends on the display
emittance function and camera sensitivity function if the message is embedded
additively, as follows:

$$I_e \propto \int_{\lambda} \left[ (\kappa + p) \cdot e(\lambda, \theta) \right] s(\lambda) \, d\lambda. \tag{5}$$

Recovery of the embedded signal leads to a difference equation

$$I_e - I_o \propto \int_{\lambda} \left[ \kappa \cdot e(\lambda, \theta) \right] s(\lambda) \, d\lambda. \tag{6}$$

The dependence on the properties of the display $e$ and the spectral
sensitivity of the camera $s$ remains. We use additive-based messaging, instead
of ratio-based methods, because this structure is convenient for convexity of
the algorithm as described in Section 4.3.

The main concept for message embedding is illustrated in Figure 7. In order to
convey many “bits” per image, we divide the image region into a series of block
components. Each block can convey a bit “1” or “0”. The blocks corresponding to
a “1” contain the added value $\kappa$, typically set to 10 gray levels, while
the zero blocks have no additive component ($\kappa = 0$). The message is
recovered by sending the original frame followed by a frame with the embedded
message and using the difference for message recovery. The message can also be
added to the coarser scales of an image pyramid decomposition [26], in order to
better hide the message within the display image content. The display can be
tracked with existing methods [38]. This message structure is deliberately
simple, so the methods presented here can be applied to many message coding
schemes.

Figure 7. Message Embedding and Retrieval. Two sequential frames are sent, an
original frame and a frame with an embedded message image. Simple differencing
is not sufficient for message retrieval. Our method (OORC) is used to recover
messages accurately.
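
The block-based embedding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, frames are plain 2D lists of gray levels, values are clipped at 255, and recovery assumes an ideal channel with no CDTF distortion.

```python
def embed_message(frame, bits, kappa=10, grid=8):
    """Return a copy of `frame` with `kappa` added to every block whose bit is 1.

    frame : 2D list of gray levels in [0, 255]; height and width divisible by `grid`
    bits  : grid*grid list of 0/1 values in row-major block order
    """
    h, w = len(frame), len(frame[0])
    bh, bw = h // grid, w // grid
    out = [row[:] for row in frame]
    for y in range(h):
        for x in range(w):
            if bits[(y // bh) * grid + (x // bw)]:
                out[y][x] = min(255, out[y][x] + kappa)  # clipping is an assumption
    return out


def recover_by_difference(original, embedded, threshold, grid=8):
    """Naive recovery: threshold the mean per-block difference I_e - I_o."""
    h, w = len(original), len(original[0])
    bh, bw = h // grid, w // grid
    sums = [0.0] * (grid * grid)
    for y in range(h):
        for x in range(w):
            sums[(y // bh) * grid + (x // bw)] += embedded[y][x] - original[y][x]
    return [1 if s / (bh * bw) > threshold else 0 for s in sums]


# Ideal-channel round trip on a flat 16x16 frame with an alternating bit pattern.
frame = [[100] * 16 for _ in range(16)]
bits = [i % 2 for i in range(64)]
sent = embed_message(frame, bits)
received = recover_by_difference(frame, sent, threshold=5)
```

In this idealized setting `received` equals `bits`. Once the frames pass through a real display and camera, the per-block difference is no longer a constant, which is why simple differencing is not sufficient.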

When accounting for the nonlinearity in the camera and display, we rewrite
Equation 4 to include the radiometric response function $f$,

$$I_c = f\left( \int_{\lambda} \left[ p \cdot e(\lambda, \theta) \right] s(\lambda) \, d\lambda \right). \tag{7}$$

More concisely,

$$I_c = f(I_d), \tag{8}$$

and the recovered display intensity is

$$I_d = f^{-1}(I_c) = g(I_c). \tag{9}$$

We use polynomials to represent the radiometric inverse function $g(i)$. The
same inverse function $g$ is used for all color channels and gray-scale ratex
patches. This simplification of the color problem is justified by the accuracy
of the empirical results.
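
To make the polynomial inverse concrete, the following sketch fits $g$ by ordinary least squares from pairs of captured and displayed ratex intensities. It illustrates the sequential (calibrate first) approach only, not the simultaneous method of Section 4.3. The function names, the normalization of intensities to [0, 1], and the synthetic CDTF are all assumptions made for the example.

```python
def fit_inverse_cdtf(captured, displayed, degree=4):
    """Least-squares fit of g so that g(captured) ≈ displayed.

    Returns coefficients [a_0, a_1, ..., a_degree] (lowest order first).
    Intensities are assumed to be normalized to [0, 1] for numerical stability.
    """
    n = degree + 1
    # Normal equations (V^T V) a = V^T d for the Vandermonde matrix V[k][j] = c_k**j
    ata = [[sum(c ** (r + s) for c in captured) for s in range(n)] for r in range(n)]
    atd = [sum((c ** r) * d for c, d in zip(captured, displayed)) for r in range(n)]

    # Gaussian elimination with partial pivoting on the augmented system
    m = [row + [rhs] for row, rhs in zip(ata, atd)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return coeffs


def apply_g(coeffs, i):
    """Evaluate g(i) = a_0 + a_1 i + ... with Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * i + a
    return result


# Synthetic ratex sweep under an assumed CDTF f(d) = 0.9 * sqrt(d).
displayed = [k / 31 for k in range(32)]
captured = [0.9 * d ** 0.5 for d in displayed]
coeffs = fit_inverse_cdtf(captured, displayed)
worst_error = max(abs(apply_g(coeffs, c) - d) for c, d in zip(captured, displayed))
```

The fit is essentially exact here only because the assumed inverse, $(c / 0.9)^2$, happens to be a polynomial. A measured CDTF would generally be approximated rather than reproduced.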

4.3. Simultaneous Radiometric Calibration and 
Message Recovery via Convex Optimization 

The two goals of message recovery and calibration can be combined into a single
problem. While ideal radiometric calibration would provide a captured image
that is a linear function of the displayed image, we show that calibration
followed by message recovery gives only a relatively small increase in message
accuracy. However, if the two goals are combined into a simultaneous problem,
we have two benefits: 1) the problem formulation can be done in a convex
optimization paradigm with a single global solution and 2) the accuracy
increases significantly.


Let $g(i)$ be the inverse function that is modeled with a fourth order
polynomial as follows

$$g(i) = a_4 i^4 + a_3 i^3 + a_2 i^2 + a_1 i + a_0. \tag{10}$$

Consider two image frames $i_o$ and $i_e$, where $i_o$ is the original frame
and $i_e$ is the image frame with the embedded message. Note the use of $i_o$
instead of $I_o$ for notational compactness. Since we are using an additive
message embedding, we wish to classify the message bits as either ones or zeros
based on the difference image $i_o - i_e$. In order to classify the message
bits, the ratex patches are also used for training. Consecutive frames of ratex
patches toggle between message bit “1” ($\kappa = 10$) and message bit “0”
($\kappa = 0$). This training data can be used for a support vector machine
(SVM) classifier.

Taking into account the radiometric calibration, we want to classify on the
recovered data $g(i_o) - g(i_e)$. Assuming that the inverse function can be
modeled by a fourth order polynomial, the function to be classified is

$$g(i_o) - g(i_e) = a_4 (i_o^4 - i_e^4) + a_3 (i_o^3 - i_e^3) + a_2 (i_o^2 - i_e^2) + a_1 (i_o - i_e). \tag{11}$$

In Equation 11, we see that the calibration problem has a physically motivated
nonlinear mapping function. That is, we see that the original data
$(i_o, i_e)$ can be placed into a higher dimensional space using the nonlinear
mapping function $\Phi$, which maps from a two dimensional space to a four
dimensional space as follows

$$\Phi(i_o, i_e) = \left[ \, (i_o^4 - i_e^4) \;\; (i_o^3 - i_e^3) \;\; (i_o^2 - i_e^2) \;\; (i_o - i_e) \, \right]. \tag{12}$$
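
The relationship between Equations 10, 11 and 12 can be checked numerically. The sketch below uses arbitrary coefficients and intensities, chosen only for illustration, and confirms that the calibrated difference is a weighted sum of the four features, with the constant term $a_0$ canceling.

```python
def phi(i_o, i_e):
    """Map a captured pair (i_o, i_e) to the four-dimensional feature of Equation 12."""
    return (i_o**4 - i_e**4, i_o**3 - i_e**3, i_o**2 - i_e**2, i_o - i_e)


def g(i, a):
    """Fourth-order inverse polynomial with coefficients a = (a_4, a_3, a_2, a_1, a_0)."""
    a4, a3, a2, a1, a0 = a
    return a4 * i**4 + a3 * i**3 + a2 * i**2 + a1 * i + a0


# Arbitrary (assumed) coefficients and normalized intensities for the check.
a = (0.3, -0.2, 0.5, 0.4, 0.05)
i_o, i_e = 0.40, 0.44

direct = g(i_o, a) - g(i_e, a)                       # left side of Equation 11
via_features = sum(coef * feat                       # right side: a_4..a_1 times Phi
                   for coef, feat in zip(a[:4], phi(i_o, i_e)))
```

The two values agree up to floating-point rounding, so a classifier that is linear in `phi(i_o, i_e)` can represent any decision rule that is linear in the calibrated difference.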

In this four dimensional space we seek a separating hyperplane between the two
classes (one-bits and zero-bits). Our experimental results indicate that these
are not separable in lower dimensional space, but the movement to a higher
dimensional space enables the separation. Also, the form of that higher
dimensional space is physically motivated by the need for radiometric
calibration. Therefore our problem becomes a support vector machine classifier
where the support vector weights and the calibration parameters are
simultaneously estimated. That is, we estimate

$$w^T u + b, \tag{13}$$

where $w \in \mathbb{R}^4$ and $b$ are the separating hyperplane parameters,
and $u$ is the input feature vector. Since we want to perform radiometric
calibration, the four-dimensional input is given by

$$u = \left[ \, a_4 (i_o^4 - i_e^4) \;\; a_3 (i_o^3 - i_e^3) \;\; a_2 (i_o^2 - i_e^2) \;\; a_1 (i_o - i_e) \, \right]^T. \tag{14}$$

Notice that $w^T u + b$ is still linear in the coefficients of the inverse
radiometric function. These coefficients and the scale factor $w$ are estimated
simultaneously. We arrive at the important observation that accounting for the
CDTF preserves the convexity of the overall classification problem. The
coefficients of the function $g$ are scaled by $w$, so that calibration and
classification can be done simultaneously, and convexity of the SVM is
preserved.
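
One way to read Equations 13 and 14 is that each product $w_k a_k$ acts as a single weight on the corresponding feature of Equation 12, so a linear soft-margin SVM trained on those features estimates classifier and calibration together. The sketch below follows that reading. It is not the authors' solver: it uses plain batch subgradient descent on the hinge loss, the training pairs are synthetic, and the CDTF, jitter, and hyperparameters are all assumed values.

```python
def phi(i_o, i_e):
    """Four-dimensional feature of Equation 12 for a captured pair."""
    return [i_o**4 - i_e**4, i_o**3 - i_e**3, i_o**2 - i_e**2, i_o - i_e]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def objective(w, b, X, y, lam):
    """Soft-margin SVM objective: (lam/2)||w||^2 + mean hinge loss (convex in w, b)."""
    hinge = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * lam * dot(w, w) + hinge / len(X)


def train_linear_svm(X, y, lam=1e-3, step=0.5, epochs=2000):
    """Batch subgradient descent; returns the best (w, b) seen and its objective."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    best = (objective(w, b, X, y, lam), w[:], b)
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:        # margin violated
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - step * gj for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
        value = objective(w, b, X, y, lam)
        if value < best[0]:
            best = (value, w[:], b)
    return best[1], best[2], best[0]


def cdtf(d):
    """Assumed synthetic camera-display transfer function on [0, 1]."""
    return 0.9 * d ** 0.5


# Synthetic ratex training pairs: label +1 for bit "1", -1 for bit "0".
kappa = 10 / 255
X, y = [], []
for k in range(1, 25):
    d = k / 32
    jitter = 0.002 if k % 2 else -0.002           # deterministic stand-in for noise
    X.append(phi(cdtf(d), cdtf(d + kappa)))        # kappa was added to the display
    y.append(+1)
    X.append(phi(cdtf(d), cdtf(d) + jitter))       # nothing was added
    y.append(-1)

w, b, final_objective = train_linear_svm(X, y)
```

Because the objective is convex in `(w, b)`, any solver that reaches the minimum finds the same global optimum. Subgradient descent is used here only to keep the example dependency-free, and it makes no claim about reaching the accuracy reported in the paper.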

4.4. Radiometric Calibration with Hidden Ratex 

The main disadvantage of OORC is the requirement that visible ratex patches
must be placed on screen. Ratex patches are convenient for modeling the CDTF,
but they are somewhat visually obtrusive and unattractive for certain
applications. Instead of directly observing the effects of the CDTF on the full
intensity gamut, we can observe how the CDTF modifies the intensity histogram.
For this to work, we need to know the initial intensity distribution of an
image before it passes through the CDTF. We perform an intensity mapping on
every image entering the camera-display transfer function so the intensity
histogram is known. We can think of the known intensity mapping of these images
as “hidden ratex.” Once the image is camera-captured, the new, modified
distribution of the image’s intensities is observed. Since the intensity
distribution is predetermined, we are able to measure the effects of the CDTF
by observing the differences in the camera-captured intensity histogram. For
example, we may wish to choose a uniform, or near uniform, intensity
distribution for camera-display transfer images. By histogram equalizing a
displayed image, a receiver can infer that the distribution of this image’s
intensities is near uniform. An intensity mapping is applied to an image before
it is displayed. Although this will have an effect on the appearance of the
carrier image, we refer to this method as hidden ratex because it does not
require markers to be displayed on screen for calibration. Once the image is
captured, the photometric effects of the CDTF have altered the image. The
captured image is then intensity mapped again, so that its intensity histogram
is more similar to the displayed distribution before distortion by the CDTF. In
other words, histogram intensity mapping acts as the inverse CDTF. Although
there is not a one-to-one correspondence, intensity mapping is an effective
method for hidden ratex as a visibly non-disruptive method for radiometric
calibration.

Because histogram-driven intensity mapping serves as an effective inverse-CDTF
mapping, embedded message bits can be labeled with simple thresholding. For
each pair of corrected images (original and embedded), intensity mapping is
applied to the original image. That same mapping is then applied to the
embedded message image. The difference between the original and carrier image
is then computed. The embedded blocks are now separable by a simple constant
threshold, because, undisrupted by the photometric effects of the CDTF, message
blocks are nothing more than a known added constant. In other words, $i_e$ and
$i_o$ are remapped via the same intensity mapping. The remapped difference
$i_e - i_o$ is used to recover the message bit.
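
The receiver-side steps in this section can be sketched with histogram equalization as the known target distribution, following the example in the text. This is a simplified illustration under assumptions: frames are flat lists of integer gray levels, blocks are consecutive runs of pixels rather than a 2D grid, and the function names are hypothetical.

```python
def equalization_lut(pixels, levels=256):
    """Build a lookup table that maps gray levels toward a uniform histogram."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    lut, running = [], 0
    for count in hist:
        running += count                           # cumulative distribution
        lut.append(round((levels - 1) * running / total))
    return lut


def recover_bits(captured_original, captured_embedded, block_size, threshold):
    """Remap both frames with the LUT of the original, then threshold block means.

    The threshold is expressed in remapped gray levels, since the mapping
    rescales the embedded difference.
    """
    lut = equalization_lut(captured_original)
    bits = []
    for start in range(0, len(captured_original), block_size):
        diff = sum(
            lut[e] - lut[o]
            for o, e in zip(captured_original[start:start + block_size],
                            captured_embedded[start:start + block_size])
        )
        bits.append(1 if diff / block_size > threshold else 0)
    return bits


# Toy captured pair: four blocks of 16 pixels, with +5 added to blocks 0 and 2.
original = list(range(40, 104))
embedded = [v + 5 if (k // 16) % 2 == 0 else v for k, v in enumerate(original)]
decoded = recover_bits(original, embedded, block_size=16, threshold=10)
```

Applying the same lookup table to both frames is the key point: the mapping is derived once from the original frame, so the embedded offset survives as a consistent shift in the remapped values.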

5. Results 

For empirical validation, 9 different combinations of displays and cameras are
used, comprised of 3 displays: 1) LG M3204CCBA 32 inch, 2) Samsung SyncMaster
2494SW, 3) iMac (21.5 inch 2009); and 3 cameras: 1) Canon EOS Rebel XSi,
2) Nikon D70, 3) Sony DSC-RX100. Fifteen 8-bit display images are used. From
each display image, we create a display video of 10 frames: 5 frames with the
original display images interleaved with 5 images of embedded time-varying
messages. An embedded message frame is followed by an original image frame to
provide the temporal image pair $i_e$ and $i_o$. The display image does not
change in the video, only the bits of the message frames. Each message frame
has 8 x 8 = 64 blocks used for message bits (with 5 bits used for ratex patches
for calibration and classification training data). Considering 5 display
images, with 5 message frames and 59 bits per frame results in approximately
1500 message bits. The accuracy for each video is defined as the number of
correctly classified bits divided by the total bits embedded and is averaged
over all testing videos. The entire test set over all display-camera
combinations is approximately 18,000 test bits.

Four methods for embedded message recovery are compared.
Method 1 (naive threshold) has no radiometric calibration;
only the difference i_e - i_o is used to recover the message
bit. Method 2 (two-step) is calibration followed by
differencing for message recovery. Method 3 (OORC) is the
optimal calibration where both radiometric calibration and
classification are done simultaneously. Method 4 (hidden
ratex) is calibration via hidden ratex followed by simple
differencing for message recovery. For the first three
methods, training data from pixels in the ratex patches are
used to train an SVM classifier. For each of the 9
display-camera combinations, the accuracy of the 4 message
recovery methods was tested with 2 sets of experimental
variables: the camera-display view, either 1) 0° frontal or
2) 45° oblique; and the embedded message intensity
difference, either 1) 5 or 2) 3. The results of these tests
can be found in Tables 1, 2, 3, and 4.
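
To illustrate why differencing without calibration can fail, the
following toy sketch contrasts uncalibrated differencing with
calibrate-then-difference recovery. Everything here is a simplifying
assumption made for illustration: the CDTF is replaced by an affine
map, a fixed threshold stands in for the SVM classifier, a bit of 1 is
assumed to add the constant and a bit of 0 to add nothing, and the
calibration intensities are hypothetical. It is not the method
evaluated in the tables.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def cdtf(x: float) -> float:
    """Toy transfer function (assumed affine): compresses contrast, adds offset."""
    return 0.4 * x + 20.0


def decode_naive(cap_e, cap_o, delta: float) -> List[int]:
    """Threshold the captured difference directly, with no calibration."""
    return [1 if (ce - co) > delta / 2 else 0 for ce, co in zip(cap_e, cap_o)]


def decode_calibrated(cap_e, cap_o, delta, calib_captured, calib_displayed) -> List[int]:
    """Map captured values back to display intensities, then difference."""
    a, b = fit_line(calib_captured, calib_displayed)  # captured -> displayed
    return [1 if ((a * ce + b) - (a * co + b)) > delta / 2 else 0
            for ce, co in zip(cap_e, cap_o)]


DELTA = 5                                   # embedded intensity difference
bits = [1, 0, 1, 1, 0]                      # message bits, one per block
i_o = [100, 150, 50, 200, 120]              # original block intensities
i_e = [o + DELTA * b for o, b in zip(i_o, bits)]
cap_o = [cdtf(v) for v in i_o]              # captured original frame
cap_e = [cdtf(v) for v in i_e]              # captured message frame
calib_displayed = [0, 64, 128, 192, 255]    # hypothetical known patch levels
calib_captured = [cdtf(v) for v in calib_displayed]

naive_bits = decode_naive(cap_e, cap_o, DELTA)
calibrated_bits = decode_calibrated(cap_e, cap_o, DELTA,
                                    calib_captured, calib_displayed)
print("naive:     ", naive_bits)
print("calibrated:", calibrated_bits)
```

In this toy setting the transfer function shrinks the added constant
below the decision threshold, so the uncalibrated decoder misses every
1 bit, while inverting the mapping first restores the original
difference.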


Accuracy (%)     Naive Threshold   Two-step   OORC     Hidden Ratex
Canon-iMac       72.94             75.67      99.17    89.63
Canon-LG         58.94             84.94      98.44    95.74
Canon-Samsung    48.44             64.89      99.39    89.91
Nikon-iMac       60.17             75.50      95.17    90.00
Nikon-LG         49.72             73.39      99.33    94.81
Nikon-Samsung    47.22             72.89      95.00    89.54
Sony-iMac        64.44             76.00      99.06    71.11
Sony-LG          56.11             75.61      98.56    90.93
Sony-Samsung     47.50             79.11      98.89    87.80
Average          56.17             75.33      98.11    88.83

Table 1. Accuracy of embedded message recovery and labeling
with additive difference +3 on [0,255] and captured with
45° oblique perspective.


Accuracy (%)     Naive Threshold   Two-step   OORC     Hidden Ratex
Canon-iMac       85.56             83.06      96.44    91.57
Canon-LG         86.39             90.94      98.67    94.07
Canon-Samsung    87.94             87.78      98.94    91.30
Nikon-iMac       84.06             84.00      96.50    90.27
Nikon-LG         74.67             81.44      99.94    90.09
Nikon-Samsung    77.33             86.06      98.00    91.57
Sony-iMac        89.33             84.22      99.44    70.00
Sony-LG          87.61             95.39      99.72    80.74
Sony-Samsung     80.00             83.78      96.26    84.54
Average          83.56             86.30      98.22    87.13

Table 2. Accuracy of embedded message recovery and labeling
with additive difference +3 on [0,255] and captured at 0° frontal
view.


Accuracy (%)     Naive Threshold   Two-step   OORC     Hidden Ratex
Canon-iMac       97.06             94.50      99.83    95.37
Canon-LG         87.89             99.00      99.39    99.44
Canon-Samsung    71.67             88.11      100.00   95.37
Nikon-iMac       91.89             93.67      96.00    96.11
Nikon-LG         81.56             95.11      99.94    98.88
Nikon-Samsung    58.78             92.22      99.39    97.41
Sony-iMac        92.28             92.00      99.72    80.37
Sony-LG          77.06             96.22      100.00   91.13
Sony-Samsung     63.28             94.17      99.89    81.67
Average          80.16             93.89      99.35    93.71

Table 3. Accuracy of embedded message recovery and labeling
with additive difference +5 on [0,255] and captured with
45° oblique perspective.


Accuracy (%)     Naive Threshold   Two-step   OORC     Hidden Ratex
Canon-iMac       95.28             96.61      99.00    95.74
Canon-LG         97.11             99.72      97.17    97.59
Canon-Samsung    97.39             97.33      98.94    94.35
Nikon-iMac       98.39             99.17      99.22    96.11
Nikon-LG         99.83             100.00     99.83    97.31
Nikon-Samsung    96.33             97.44      98.56    95.74
Sony-iMac        97.72             97.00      99.94    81.67
Sony-LG          99.39             100.00     100.00   90.74
Sony-Samsung     92.50             92.33      98.06    90.28
Average          97.10             97.73      98.97    93.28

Table 4. Accuracy of embedded message recovery and labeling
with additive difference +5 on [0,255] and captured at 0° frontal
view.


6. Discussion and Conclusion 

The results indicate a substantial improvement of bit
classification in a camera-display messaging system with
our methods: for example, at 45° and an additive difference
of +3, average accuracy rises from 56.17% with naive
thresholding to 98.11% with OORC (Table 1). We demonstrate
experimental results for nine different camera-display
combinations at frontal and oblique viewing directions. We
show that naive thresholding is a poor choice because the
variation of display intensity with camera position is
ignored. Any method that embeds a message without
accounting for the variation of display intensity will
degrade for non-frontal views. We present two ways to
perform online radiometric calibration. The first method
uses calibration information in the image called ratex
patches. In the second approach, the calibration
information is hidden and no patches appear in the image.
Our experimental results show that hidden, dynamic
messages can be embedded in a display image and recovered
robustly. We show that naive methods of message embedding
without photometric modeling lead to failed message
recovery, especially for oblique views (45°) and small
intensity messages (+3). We present a visually
non-disruptive method for radiometric calibration in the
form of hidden ratex intensity mapping. Although the CDTF
is spatially dependent, a single set of calibration
coefficients per frame was sufficient for high message
accuracy. The approach is well-justified by theory and by
empirical evaluation.
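
The comparison across conditions can be made concrete from the
Average rows of Tables 1 to 4. The short sketch below only
re-tabulates those reported averages; the dictionary layout and the
helper name are illustrative assumptions.

```python
from typing import Dict, Tuple

# Average rows copied from Tables 1-4, keyed by (view, additive difference).
AVERAGES: Dict[Tuple[str, int], Dict[str, float]] = {
    ("45 deg oblique", 3): {"naive": 56.17, "two_step": 75.33,
                            "oorc": 98.11, "hidden_ratex": 88.83},
    ("0 deg frontal", 3): {"naive": 83.56, "two_step": 86.30,
                           "oorc": 98.22, "hidden_ratex": 87.13},
    ("45 deg oblique", 5): {"naive": 80.16, "two_step": 93.89,
                            "oorc": 99.35, "hidden_ratex": 93.71},
    ("0 deg frontal", 5): {"naive": 97.10, "two_step": 97.73,
                           "oorc": 98.97, "hidden_ratex": 93.28},
}


def gain_over_naive(method: str) -> Dict[Tuple[str, int], float]:
    """Percentage-point difference between a method and naive thresholding."""
    return {condition: round(row[method] - row["naive"], 2)
            for condition, row in AVERAGES.items()}


for (view, delta), gain in gain_over_naive("oorc").items():
    print(f"{view}, +{delta}: OORC minus naive = {gain:+.2f} points")
```

The gap is largest for the oblique view with the +3 message and
smallest for the frontal view with the +5 message, which matches the
observation that naive recovery suffers most at oblique angles and
small intensity differences.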